From the PO.DAAC Cookbook, to access the GitHub version of the notebook, follow this link.

Working with SWOT Level 2 Water Mask Raster Image Data Product:

In AWS Cloud Version

Authors: Nicholas Tarpinian, PO.DAAC | Catalina Taglialatela (JPL, PO.DAAC)

Summary & Learning Objectives

Notebook showcasing how to work with multiple files from the SWOT Raster Image data product version 2.0 in the cloud

  • Utilizing the earthaccess Python package. For more information visit: https://nsidc.github.io/earthaccess/
  • Option to query the new dataset based on user’s choice; choosing between two resolutions either by ‘100m’ or ‘250m’.
  • Visualizing multiple raster images on a single map.
  • Stacking multiple raster images and creating a time dimension to analyze over time.
  • Adjusting images based on quality flag

Requirements

1. Compute environment

This tutorial is written to run in the following environment: - AWS instance running in us-west-2: NASA Earthdata Cloud data in S3 can be directly accessed via an s3fs session; this access is limited to requests made within the US West (Oregon) (code: us-west-2) AWS region. - This workflow as written works on a 14.8 GB RAM, upto 3.7 CPU cloud compute instance type. Smaller instances tent to crash.

2. Earthdata Login

An Earthdata Login account is required to access data, as well as discover restricted data, from the NASA Earthdata system. Thus, to access NASA data, you need Earthdata Login. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. This account is free to create and only takes a moment to set up.

Import libraries

import io
import s3fs
import xarray as xr
import numpy as np
from datetime import datetime
from pathlib import Path
import hvplot
import hvplot.xarray 
import earthaccess

Authentication with earthaccess

In this notebook, we will be calling the authentication in the below cell.

auth = earthaccess.login()

Search for SWOT Raster products using earthaccess

Each dataset has its own unique collection concept ID. For the SWOT_L2_HR_Raster_2.0 dataset, we can find the collection ID here.

For this tutorial, we are looking at the Lake Mead Reservoir in the United States.

We used bbox finder to get the exact coordinates for our area of interest.

raster_results = earthaccess.search_data(
    short_name = 'SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-115.112686,35.740939,-114.224167,36.937819),
    temporal =('2024-02-01 12:00:00', '2024-02-01 23:59:59'),
    granule_name = '*_100m_*',   #specify we are interested in the 100m standard raster
    count =200
)
Granules found: 2

Visualizing Multiple Tiles

Let’s now visualize multiple raster tiles that we searched and explore the data.

Utilizing xarray.open_mfdataset which supports the opening of multiple files.

ds = xr.open_mfdataset(earthaccess.open(raster_results), engine='h5netcdf',combine='nested', concat_dim='x')
ds
Opening 2 granules, approx size: 0.07 GB
using endpoint: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
<xarray.Dataset>
Dimensions:                  (y: 2784, x: 3074)
Coordinates:
  * y                        (y) float64 3.899e+06 3.899e+06 ... 4.177e+06
  * x                        (x) float64 5.391e+05 5.392e+05 ... 7.217e+05
Data variables: (12/39)
    crs                      (x) object b'1' b'1' b'1' b'1' ... b'1' b'1' b'1'
    longitude                (y, x) float64 dask.array<chunksize=(512, 513), meta=np.ndarray>
    latitude                 (y, x) float64 dask.array<chunksize=(512, 513), meta=np.ndarray>
    wse                      (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    wse_qual                 (y, x) float32 dask.array<chunksize=(2784, 1538), meta=np.ndarray>
    wse_qual_bitwise         (y, x) float64 dask.array<chunksize=(768, 769), meta=np.ndarray>
    ...                       ...
    load_tide_fes            (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    load_tide_got            (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    pole_tide                (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    model_dry_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    model_wet_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
    iono_cor_gim_ka          (y, x) float32 dask.array<chunksize=(768, 769), meta=np.ndarray>
Attributes: (12/49)
    Conventions:                   CF-1.7
    title:                         Level 2 KaRIn High Rate Raster Data Product
    source:                        Ka-band radar interferometer
    history:                       2024-02-05T08:37:45Z : Creation
    platform:                      SWOT
    references:                    V1.2.1
    ...                            ...
    x_min:                         539100.0
    x_max:                         692800.0
    y_min:                         4023100.0
    y_max:                         4176900.0
    institution:                   CNES
    product_version:               01
    • y
      PandasIndex
      PandasIndex(Float64Index([3898600.0, 3898700.0, 3898800.0, 3898900.0, 3899000.0, 3899100.0,
                    3899200.0, 3899300.0, 3899400.0, 3899500.0,
                    ...
                    4176000.0, 4176100.0, 4176200.0, 4176300.0, 4176400.0, 4176500.0,
                    4176600.0, 4176700.0, 4176800.0, 4176900.0],
                   dtype='float64', name='y', length=2784))
    • x
      PandasIndex
      PandasIndex(Float64Index([539100.0, 539200.0, 539300.0, 539400.0, 539500.0, 539600.0,
                    539700.0, 539800.0, 539900.0, 540000.0,
                    ...
                    720800.0, 720900.0, 721000.0, 721100.0, 721200.0, 721300.0,
                    721400.0, 721500.0, 721600.0, 721700.0],
                   dtype='float64', name='x', length=3074))
  • Conventions :
    CF-1.7
    title :
    Level 2 KaRIn High Rate Raster Data Product
    source :
    Ka-band radar interferometer
    history :
    2024-02-05T08:37:45Z : Creation
    platform :
    SWOT
    references :
    V1.2.1
    reference_document :
    JPL D-56416 - Revision C - December 8, 2023
    contact :
    podaac@podaac.jpl.nasa.gov
    cycle_number :
    10
    pass_number :
    218
    scene_number :
    45
    tile_numbers :
    [88 89 90 91 88 89 90 91]
    tile_names :
    218_088L, 218_089L, 218_090L, 218_091L, 218_088R, 218_089R, 218_090R, 218_091R
    tile_polarizations :
    H, H, H, H, V, V, V, V
    coordinate_reference_system :
    Universal Transverse Mercator
    resolution :
    100.0
    short_name :
    L2_HR_Raster
    descriptor_string :
    100m_UTM11S_N_x_x_x
    crid :
    PIC0
    pge_name :
    PGE_L2_HR_RASTER
    pge_version :
    5.1.1
    time_granule_start :
    2024-02-01T18:38:14.266613Z
    time_granule_end :
    2024-02-01T18:38:35.362539Z
    time_coverage_start :
    2024-02-01T18:38:14.805141Z
    time_coverage_end :
    2024-02-01T18:38:34.816548Z
    geospatial_lon_min :
    -116.55823577002582
    geospatial_lon_max :
    -114.84459394683485
    geospatial_lat_min :
    36.350992807631336
    geospatial_lat_max :
    37.724496129136035
    left_first_longitude :
    -115.14314896157633
    left_first_latitude :
    37.724496129136035
    left_last_longitude :
    -114.84459394683485
    left_last_latitude :
    36.5951883116925
    right_first_longitude :
    -116.55823577002582
    right_first_latitude :
    37.474717676188774
    right_last_longitude :
    -116.24045666354958
    right_last_latitude :
    36.350992807631336
    xref_l2_hr_pixc_files :
    SWOT_L2_HR_PIXC_010_218_088L_20240201T183804_20240201T183815_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_089L_20240201T183814_20240201T183825_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_090L_20240201T183824_20240201T183835_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_091L_20240201T183834_20240201T183845_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_088R_20240201T183804_20240201T183815_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_089R_20240201T183814_20240201T183825_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_090R_20240201T183824_20240201T183835_PIC0_01.nc, SWOT_L2_HR_PIXC_010_218_091R_20240201T183834_20240201T183845_PIC0_01.nc
    xref_l2_hr_pixcvec_files :
    SWOT_L2_HR_PIXCVec_010_218_088L_20240201T183804_20240201T183815_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_089L_20240201T183814_20240201T183825_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_090L_20240201T183824_20240201T183835_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_091L_20240201T183834_20240201T183845_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_088R_20240201T183804_20240201T183815_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_089R_20240201T183814_20240201T183825_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_090R_20240201T183824_20240201T183835_PIC0_01.nc, SWOT_L2_HR_PIXCVec_010_218_091R_20240201T183834_20240201T183845_PIC0_01.nc
    xref_param_l2_hr_raster_file :
    SWOT_Param_L2_HR_Raster_20000101T000000_21000101T000000_20230817T100000_v302.rdf
    xref_reforbittrack_files :
    SWOT_RefOrbitTrackTileBoundary_Nom_20000101T000000_21000101T000000_20200617T193054_v101.txt, SWOT_RefOrbitTrack125mPass1_Nom_20000101T000000_21000101T000000_20200617T193054_v101.txt, SWOT_RefOrbitTrack125mPass2_Nom_20000101T000000_21000101T000000_20200617T193054_v101.txt
    utm_zone_num :
    11
    mgrs_latitude_band :
    S
    x_min :
    539100.0
    x_max :
    692800.0
    y_min :
    4023100.0
    y_max :
    4176900.0
    institution :
    CNES
    product_version :
    01
  • raster_plot = ds.wse.hvplot.quadmesh(x='x', y='y', rasterize=True, title=f'SWOT Raster 100m: Lake Mead Reservoir')
    raster_plot.opts(width=700, height=600, colorbar=True)

    Creating a Time Series

    SWOT Raster product does not include a time dimension, each file is a snapshot in time, but it can be inserted by extracting from the file name.

    1. Expand the time range of your earthaccess search to get an adequate range.
    2. Extract the datetime from the s3 file name then concatenate based on the new time dimension.
    time_results = earthaccess.search_data(
        short_name = 'SWOT_L2_HR_RASTER_2.0',
        bounding_box=(-114.502048,36.060175,-114.390983,36.210182),
        temporal =('2024-01-25 00:00:00', '2024-03-04 23:59:59'),
        granule_name = '*_100m_*',
        count =200
    )
    Granules found: 3
    fs_s3 = earthaccess.get_s3fs_session(results=time_results)
    # Get links list
    raster = []
    for g in time_results:
        for l in earthaccess.results.DataGranule.data_links(g, access='direct'):
            raster.append(l)
    
    print(len(raster))
    raster
    3
    ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_HR_Raster_2.0/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_205_109F_20240201T075048_20240201T075109_PIC0_01.nc',
     's3://podaac-swot-ops-cumulus-protected/SWOT_L2_HR_Raster_2.0/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_496_046F_20240211T170050_20240211T170111_PIC0_01.nc',
     's3://podaac-swot-ops-cumulus-protected/SWOT_L2_HR_Raster_2.0/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_011_205_109F_20240222T043554_20240222T043615_PIC0_01.nc']
    def add_time_dimension(ds, file_path):
        # Extract filename from s3 file path
        file_name = file_path.split('/')[-1]
        # Extract date/time string from filename
        date_str = file_name.split('_')[-4][:15]
        # Convert the date string to a datetime object
        time_value = datetime.strptime(date_str, "%Y%m%dT%H%M%S")
        # Assign the time coordinate to the dataset
        ds.coords['time'] = time_value
        return ds
    datasets = []
    file_names = []
    
    for file_path in raster:
        with fs_s3.open(file_path, 'rb') as file:
            file_bytes = file.read()
        file_obj = io.BytesIO(file_bytes)
        dataset = xr.open_dataset(file_obj, engine='h5netcdf')
        dataset_with_time = add_time_dimension(dataset, file_path)
        datasets.append(dataset_with_time)
        file_names.append(file_path.split('/')[-1])
        dataset.close()
    # sorting the time dimension in correct order
    datasets.sort(key=lambda ds: ds.time.values)
    ds2 = xr.concat(datasets, dim='time')
    ds2
    <xarray.Dataset>
    Dimensions:                  (x: 1549, y: 1549, time: 3)
    Coordinates:
      * x                        (x) float64 6.788e+05 6.789e+05 ... 8.336e+05
      * y                        (y) float64 3.9e+06 3.901e+06 ... 4.055e+06
      * time                     (time) datetime64[ns] 2024-02-01T07:50:48 ... 20...
    Data variables: (12/39)
        crs                      (time) object b'1' b'1' b'1'
        longitude                (time, y, x) float64 nan nan nan ... nan nan nan
        latitude                 (time, y, x) float64 nan nan nan ... nan nan nan
        wse                      (time, y, x) float32 nan nan nan ... nan nan nan
        wse_qual                 (time, y, x) float32 nan nan nan ... nan nan nan
        wse_qual_bitwise         (time, y, x) float64 nan nan nan ... nan nan nan
        ...                       ...
        load_tide_fes            (time, y, x) float32 nan nan nan ... nan nan nan
        load_tide_got            (time, y, x) float32 nan nan nan ... nan nan nan
        pole_tide                (time, y, x) float32 nan nan nan ... nan nan nan
        model_dry_tropo_cor      (time, y, x) float32 nan nan nan ... nan nan nan
        model_wet_tropo_cor      (time, y, x) float32 nan nan nan ... nan nan nan
        iono_cor_gim_ka          (time, y, x) float32 nan nan nan ... nan nan nan
    Attributes: (12/49)
        Conventions:                   CF-1.7
        title:                         Level 2 KaRIn High Rate Raster Data Product
        source:                        Ka-band radar interferometer
        history:                       2024-02-05T12:55:01Z : Creation
        platform:                      SWOT
        references:                    V1.2.1
        ...                            ...
        x_min:                         680100.0
        x_max:                         829300.0
        y_min:                         3903300.0
        y_max:                         4052400.0
        institution:                   CNES
        product_version:               01
    timeplot = ds2.wse.hvplot.image(y='y', x='x')
    timeplot.opts(width=700, height=500, colorbar=True)

    Let’s plot the wse quality flag, wse_qual which ranges 0-3 where 0=good, 1=suspect, 2=degraded, 3=bad (as described when printing variable with xarray).

    timeplot = ds2.wse_qual.hvplot.image(y='y', x='x')
    timeplot.opts(width=700, height=500, colorbar=True)

    Masking a variable with its quaility flag

    variable_to_mask = ds2['wse']
    mask_variable = ds2['wse_qual']
    # Define the condition for masking based on the range of the quaility flag
    mask_condition = mask_variable <2
    
    masked_variable = variable_to_mask.where(mask_condition)
    # Update the masked variable in the dataset
    ds2['wse'] = masked_variable
    
    ds2['wse'].hvplot.image(y='y', x='x').opts(width=700, height=500, colorbar=True)

    Our end product is a time series of the data showing only the values where the quality flag is either good (0) or suspect (1).

    Appendix: Alternate Plot

    # # Alternate plotting with matplotlib
    # %matplotlib inline
    
    # import matplotlib.pyplot as plt
    # from matplotlib import animation
    # from matplotlib.animation import FuncAnimation, PillowWriter
    # from IPython.display import display, Image, HTML
    
    # variable_name = 'wse'
    # data = ds2[variable_name]
    
    # fig, ax = plt.subplots(figsize=(10, 8))
    # fig.set_tight_layout({'rect': [0.01, 0.01, 1.0, 1.0]})
    
    # contour = ax.contourf(data.isel(time=0), cmap='viridis')
    # cbar = plt.colorbar(contour)
    # cbar.set_label('Water Surface Elevation (meters)', fontsize=14) 
    # times = ds2.time.values
    
    # # Function to update the plot for each time step
    # def update(frame):
    #     ax.clear()
    #     contour = ax.contourf(data.isel(time=frame), cmap='viridis')
    #     formatted_time = str(times[frame])[:-7]
    #     ax.set_title(f'Date: {formatted_time}')
    #     ax.set_xlabel('Longitude', fontsize=14)
    #     ax.set_ylabel('Latitude', fontsize=14)
    #     ax.text(0.5, 1.05, 'SWOT Raster 100M Lake Mead Reservoir', transform=ax.transAxes, ha='center', fontsize=14)
    #     return contour,
    
    # # Creating a gif animation
    # ani = animation.FuncAnimation(fig, update, repeat=True, frames=len(data['time']), blit=True, interval=3000)
    
    # output = ('time_series.gif')
    # ani.save(output, writer='pillow', fps=.5)
    
    # with open(output,'rb') as f:
    #     display(Image(data=f.read(), format='gif'))
    
    # plt.close(fig)
    # ds2.close()